Normalizing Constrained Symbolic Data for Clustering

Authors

  • Marc Csernel
  • Francisco de A. T. de Carvalho
Abstract

Clustering is one of the most common operations in data analysis, whereas constrained clustering is less common. We present a clustering method, in the framework of Symbolic Data Analysis (SDA), for clustering symbolic data. Such data can be constrained by relations between the variables, expressed as rules that encode domain knowledge. However, such rules can induce a combinatorial increase in computation time as the number of rules grows. In this paper we present a way to cluster such data in quadratic time. The method first decomposes the data according to the rules; a dissimilarity-based clustering algorithm can then be applied to the decomposed data.
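The following is a minimal illustrative sketch, not the authors' algorithm, of why rules are costly and why decomposing a description according to a rule can help. All names (`Rule`, `coherent_individuals`, `decompose`), the toy variables, and the rule itself are assumptions made for this example. A symbolic description is modelled as a mapping from each variable to a set of allowed values, and a rule has the form "if the premise variable takes a value in a given set, then the conclusion variable must take a value in another given set". The naive check enumerates every combination of values, while the decomposition splits the description into rule-free parts whose sizes can be computed without enumeration.

```python
from itertools import product
from math import prod
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: if premise_var in premise_vals then concl_var in concl_vals."""
    premise_var: str
    premise_vals: frozenset
    concl_var: str
    concl_vals: frozenset


def coherent_individuals(desc, rule):
    """Naive approach: enumerate every combination and keep those satisfying the rule."""
    names = sorted(desc)
    kept = set()
    for values in product(*(sorted(desc[n]) for n in names)):
        individual = dict(zip(names, values))
        premise_holds = individual[rule.premise_var] in rule.premise_vals
        conclusion_holds = individual[rule.concl_var] in rule.concl_vals
        if premise_holds and not conclusion_holds:
            continue  # this combination violates the rule
        kept.add(values)
    return kept


def decompose(desc, rule):
    """Split a description into rule-free parts (assumes premise_var != concl_var)."""
    # Part 1: the premise holds, so the conclusion variable is restricted.
    holds = dict(desc)
    holds[rule.premise_var] = desc[rule.premise_var] & rule.premise_vals
    holds[rule.concl_var] = desc[rule.concl_var] & rule.concl_vals
    # Part 2: the premise does not hold, so the conclusion variable is unrestricted.
    fails = dict(desc)
    fails[rule.premise_var] = desc[rule.premise_var] - rule.premise_vals
    # Drop parts in which some variable has no admissible value left.
    return [part for part in (holds, fails) if all(part.values())]


def part_size(part):
    """Number of individuals covered by a rule-free part: a plain product of set sizes."""
    return prod(len(values) for values in part.values())


description = {
    "size": {"small", "medium", "large"},
    "colour": {"red", "blue", "green"},
}
rule = Rule("size", frozenset({"large"}), "colour", frozenset({"red"}))

naive = coherent_individuals(description, rule)
parts = decompose(description, rule)
```

In this toy case the two parts together cover exactly the combinations that the naive enumeration keeps, but each part can be handled as an ordinary rule-free description. How the paper organizes such parts, and how it achieves quadratic time, is not specified in the abstract and is not reproduced here.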


Similar resources

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched constrained clustering algorithm is microaggregation. In a microaggreg...


Expert Constrained Clustering: A Symbolic Approach

A new constrained model is discussed as a way of efficiently incorporating a priori expert knowledge into a clustering problem for a given set of individuals. The first innovation is the combination of fusion constraints, which require some individuals to belong to one cluster, with exclusion constraints, which separate some individuals into different clusters. This situation implies to check the exis...


Fuzzy clustering algorithms for mixed feature variables

This paper presents fuzzy clustering algorithms for mixed features of symbolic and fuzzy data. El-Sonbaty and Ismail proposed fuzzy c-means (FCM) clustering for symbolic data and Hathaway et al. proposed FCM for fuzzy data. In this paper we give a modified dissimilarity measure for symbolic and fuzzy data and then give FCM clustering algorithms for these mixed data types. Numerical examples and ...


Wasserstein Metric Based Adaptive Fuzzy Clustering Methods for Symbolic Data

Given the current limitations of fuzzy clustering metrics, the aim of this paper is to present new Wasserstein-metric-based adaptive fuzzy clustering methods for partitioning symbolic interval data. The Wasserstein metric shows advantages in extracting distribution information from symbolic interval data. Besides, the proposed fuzzy clustering methods also emphasize correlation structure between indices...


Clustering Trees with Instance Level Constraints

Constrained clustering investigates how to incorporate domain knowledge in the clustering process. The domain knowledge takes the form of constraints that must hold on the set of clusters. We consider instance-level constraints, such as must-link and cannot-link. This type of constraint has been successfully used in popular clustering algorithms, such as k-means and hierarchical agglomerative ...



Publication date: 2011